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Abstract. We study the force-induced unfolding of random disordered RNA or single-stranded DNA poly- 
mers. The system undergoes a second order phase transition from a collapsed globular phase at low forces 
to an extensive necklace phase with a macroscopic end-to-end distance at high forces. At low temperatures, 
the sequence inhomogeneities modify the critical behaviour. We provide numerical evidence for the uni- 
versality of the critical exponents which, by extrapolation of the scaling laws to zero force, contain useful 
information on the ground state (/ = 0) properties. This provides a good method for quantitative studies 
of scaling exponents characterizing the collapsed globule. In order to get rid of the blurring effect of thermal 
fluctuations we restrict ourselves to the groundstate at fixed external force. We analyze the statistics of 
rearrangements, in particular below the critical force, and point out its implications for force-extension 
experiments on single molecules. 
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1 Introduction 



The last years have whitnessed a lot of progress in the ex- 
perimental study of force-induced unfolding of biomolecules 
using techniques such as atomic force microscopy and opti- 
cal tweezers. A variety of experiments on RNA and single- 
■ or double-stranded DNA have been devised Jl|,||,||,^,p|,^|, 
|?J to study the behaviour of these polymers under an ex- 
ternal force which allows to determine their elastic prop- 
erties and provides new insight into the folding problem of 
biomolecules. In particular, these techniques offer a means 
to investigate the energy landscape and folding pathways 
and to extract specific thermodynamic parameters. In this 
paper, we will study the force-induced unfolding of RNA 
or single-stranded DNA, with special emphasis on the ef- 
fects of sequence heterogeneity (disorder) . 

Several theoretical models to describe molecules un- 
der external forces have been investigated for the case of 
general heteropolymcrs [p ill 

PIS polyelectrolytes 10 



DNA-unzipping ||,0[l|,|l6 



and unfolding of RNA 



I . Within a mean-field treatment of heteropolymers 
W, disorder has been shown to be relevant in the glassy 
low force regime where a random energy model applies. 
In a critical region around the (first order) denaturation 
transition the disorder induces a necklace structure in the 
polymer chain which is intermediate between the globular 
and the fully extended state. The breaking of individual 
globular domains upon increasing the force leads to step- 
like force-extension characteristics |10|. 
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In the unzipping of a DNA double strand sequence, 
heterogeneity has been shown to modify the critical be- 
haviour of the opening phase transition jb3, 14 which has 
been characterized in terms of the statistics of elementary 
unzipping events below the threshold force. 

The case of unfolding of RNA is more involved in 
that the ground state usually has a much more compli- 
cated structure than a single hairpin (the equivalent of 
a DNA double strand) and competes with a large num- 
ber of low-lying metastable states into which the system 
can be driven by the external force. On the other hand, 
for the same reason, the force-extension characteristics 
at low enough forces may reveal more information about 
the energy landscape of the molecule jl7],[ll). In previ- 
ous articles, force-extension curves have been discussed 
for a homogeneous model of RNA Jl{|^0| as well as for 
disordered RNA pl|. The homogeneous description ex- 
hibits a second order phase transition from a collapsed 
globular state to an extended necklace-like phase. We will 
show that the phase transition persists under the introduc- 
tion of sequence disorder. However, at low temperatures, 
the critical behaviour is modified and the globular phase 
becomes glassy which manifests itself in the statistics of 
jumps in force-extension curves. At higher temperatures, 
the homogeneous description has been argued to be a valid 
coarse-grained description of random RNA (20) . 

The extended phase is not very sensitive to disorder, 
and random RNA exhibits rather few specific features in 
pulling experiments . Indeed the fit of the experimen- 
tal data in jl9| based on a homogeneous model is remark- 
ably good. The force-extension curves of naturally selected 
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RNA are usually richer than those of random sequences 
of the same size which probably reflects the bias of natu- 
rally selected RNA towards sequences with a particularly 
stable ground state and favourable folding characteristics 

In this paper we concentrate on the low forces and the 
critical window around the above-mentioned second order 
phase transition. This regime exhibits a lot of interesting 
features that are intimately related to the energy land- 
scape of the molecule. It is in this regime that the most 
complex force-induced rearrangements in the secondary 
structure take place. These are expected to proceed via 
a slow activated dynamics giving rise to long equilibra- 
tion times and ageing or hysteresis effects (see || for re- 
lated experiments done on DNA). The latter are prob- 
ably relevant for the behaviour of large RNA molecules 
such as messenger RNA or heteronuclear RNA. Most cur- 
rent experiments however usually operate at forces com- 
parable to the threshold force to unzip double-stranded 
DNA (w lOpN) which is considerably larger than the crit- 
ical force mentioned above. In sufficiently small molecules, 
these experiments allow for the identification of parts of 
the folding pathway in short molecules |||l7|[^]. The na- 
ture of the events that dominate the force-extension curve 
in the extensive phase is however rather simple in that 
they are transistions between two competing foldings of a 
relatively small domain. The results of this article show 
that the large scale energy landscape should be studied at 
forces in the critical window and at modest chain exten- 
sions, and we give qualitative predictions for the expected 
behaviour of random disordered molecules in this regime. 

The relative smoothness of the force-extension curves 
of RNA hides a large part of the structural transitions in 
the molecule and thus hinders the understanding of the 
underlying processes. The authors of JlJ] trace the origin 
of the smoothness to essentially three factors: i) the ther- 
mal effects in the form of entropic elasticity that naturally 
limit the resolution of any force-extension experiment p2[ , 
ii) the contribution of several competing secondary struc- 
tures with comparable free energy but different extension, 
and, most importantly, iii) the fact that several globules 
are pulled upon in parallel so that opening one of them 
may be accompanied by the re-closure of a neighbouring 
one, smearing out the jumps in extension that one would 
observe from a single globule. 

Since we do not aim at giving an accurate prediction of 
force-extension curves but rather want to analyze the dy- 
namics in the secondary structure, we will circumvent the 
first two smoothing effects by eliminating thermal fluctu- 
ations: Instead of averaging over all secondary structures 
with their appropriate Boltzmann weight (at fixed force) 
we restrict ourselves to the structure with the lowest free 
energy which allows us to get rid of the entropic fluctua- 
tions in the phase space of pairing patterns. Furthermore, 
we characterize the extension of a given secondary struc- 
ture by its fully stretched end-to-end distance rather than 
the extension in real space which is subject to thermal 
fluctuations. In this way, we can analyze the direct signa- 
tures of sequence disorder as a succession of sharp jumps 



in the equilibrium force-extension characteristics. In an 
experiment, they would show up as bistabilities |p|,|6|,[r^| 
that are further smoothed by entropic fluctuations of the 
extension. However, we believe that (at low enough tem- 
peratures) thermal effects do not alter the physics in an 
essential way apart from blurring the data to a certain 
extent. 

The second order phase transition mentioned above 
appears in a new light if considered from the point of view 
of sequence specific response. The critical force separates 
two qualitatively different regimes: At low force, the chain 
folds into very few large globular structures that may re- 
arrange dramatically upon an increase of the force under 
quasi-equilibrium conditions, thus revealing information 
about folding pathways and the energy landscape. On the 
other hand, at forces above the threshold the chain orga- 
nizes into a necklace-like structure with an extensive num- 
ber of small globules linked by unpaired single strands. 
The disorder manifests itself only weakly in this phase 
since the sequence specific response of the individual glob- 
ules is averaged out when the force pulls on an extensive 
number of structures in parallel. In the high-force regime 
the force-extension curves become indeed rather feature- 
less. 

In this paper, we study in detail the critical behaviour 
in the vicinity of the threshold force. We find numerically 
that different disordered models belong to the same uni- 
versality class and derive relations between the critical 
exponents. Extrapolation of the scaling laws to zero force 
allows us to obtain the scaling behaviour of various ob- 
servables in the ground state. In particular, we discuss 
the scaling of the radius of gyration and the implications 
for the mean monomer density in real space. Furthermore 
we characterize the statistical properties of the jump-like 
events in the low-force and the critical regime, emphasiz- 
ing the relevance of these results with respect to single 
molecule experiments. 

The models are introduced in section |^. In section |^ 
we review the properties of homogeneous RNA and intro- 
duce the relevant structural observables. The algorithm 
to obtain the force-extension curve for disordered RNA is 
explained in section ^, and the results are presented in sec- 
tion |5[ Finally, section || concludes with a summary and 
an outlook including experimental perspectives. 

2 Models for RNA 

2.1 Topological rules for the secondary structure 

We will use a simplified model of RNA as in previous 
work |2j|p4|,H|], describing the folding of a sequence of 
N bases {b~i}i=i,...,N by its secondary structure, i.e., the 
list of ordered base pairs (bi, bj) with i < j, whereby each 
base can at most be paired to one other base. As usual 
in the prediction of secondary structure (see e.g. (2(|), we 
do not allow for pseudo-knots, i.e., base pairs (bi,bj) and 
(bk,bi) with either i < k < j < I or k < i < I < j. In 
order to account for the importance of the base stacking 
energy as compared to the covalent pairing energy we do 
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not allow for isolated base pairs that are known to be 
thermodynamically unstable. 

The free energy of a given secondary structure is taken 
to be the sum of independent local contributions, but in- 
stead of using the extensive set of rules established by 
Turner's group 27 (as has been done in we study 
two simplified models which we believe to reproduce the 
essential features of the energy landscape and the force- 
extension curves in particular. 



2.2 The Models 



The Gaussian model assigns a random pairing energy e 
to each base pair (bi,bj) where the e^'s are indepen- 
dent Gaussian variables with zero mean and unit vari- 
ance. Studies of the glass phase with such a model have 
shown that it captures well the relevant features of more 
elaborate models. However, due to the absence of detailed 
microscopic constraints and the continuous distribution of 
pairing energies the model approximates fairly well the be- 
haviour of real disordered RNA on a coarse-grained level 
where the bases hi correspond to small substrands of the 
real molecule and the e^.j describe effective pairing affini- 
ties between those substrands. As a consequence, the fi- 
nite size effects are smaller than in a model with detailed 
sequence dependent interactions. 

The four letters model starts from a sequence of letters 
A,C,G,U and assigns a stacking energy to each couple of 
adjacent base pairs where only Watson-Crick (A-U and 
G-C) and wobble pairs (G-U) are allowed. We have taken 
the thermodynamic values used in Zuker's algorithm p8| 
at 37°. The model differs from the full set of Turner rules 
mainly by the neglect of cntropic costs for hairpin and 
internal loops and bulges. The inclusion of these terms 
would re-shuffle the low-lying states without altering the 
qualitative response to an external force. We therefore ex- 
pect the results of the four letters model to be represen- 
tative for real molecules. 

Both toy models behave qualitatively in the same way 
but the finite size effects are stronger in the four letters 
model [and there is a (trivial) bunching of jump events at 
higher forces due to the discreteness of the stacking ener- 
gies which does not occur in the Gaussian model with con- 
tinuous energies]. For most of the subsequent discussion 
we will present the results of the Gaussian model. How- 
ever, we will refer to the more realistic four letters model 
when we discuss signatures of specific single molecules. 

Similar toy models have been considered by many au- 
thors to study the low temperature behaviour of RNA 
|^,^9[|3^,|2^,0. There is wide agreement that at suffi- 
ciently low temperatures the sequence heterogeneity leads 
to a glassy phase where the molecule is trapped within a 
rugged energy landscape whose precise character is still 
not very well understood, however. 

We note in passing that there could also be a glass 
transition on the level of the tertiary structure, i.e., the 
arrangement of a given secondary structure in real space, 
as has been predicted in |32j|. Of course, this aspect of the 



energy landscape cannot be captured by our description 
restricted to the secondary structure. 



2.3 RNA under tension 

An external force couples to the system by adding a term 
— f • (rjv — i"i) to the total energy. The end-to-end distance 
of the molecule rjv — i"i can be decomposed into a sum of 
contributions from the globules in the chain and from the 
single strands linking them. We keep the description sim- 
ple by adding an energy — fl for each backbone element in 
a linker strand and for each globular structure within the 
free part of the chain, see Fig. [j]. Thereby, / denotes the 
(mean) projection of the chain elements onto the direc- 
tion of the force. This description eliminates by hand the 
thermal fluctuations due to entropic elasticity. By treat- 
ing the closing bonds of the globules on the same level as 
the linker elements we most probably underestimate their 
contribution to the free length, but we do not expect the 
physics to depend crucially on these details. 

Finally, the total energy of a secondary structure S 
reads 



E N (S) = 



^ ^ ^pair/stack 
pairs or stacks 6<S 



fl-n(S) (1) 



where n(S) is the extension of the chain in the direction 
of the force, see Fig. 0. 



3 Review of homogeneous RNA 



We briefly review the results [|33j,|2C|] obtained in the frame- 
work of a homogeneous description of RNA where all pair- 
ing or stacking energies are independent of the interacting 
bases. This model applies to homogeneous polymers (pe- 
riodic self-complementary series AT AT ... or GCGC . . .) 
as well as to disordered sequences at high temperatures 
on a coarse-grained level. For details we refer the reader 

to a 



3.1 Radius of gyration 

The radius of gyration R g of the molecule is an important 
and experimentally accessible observable. Its leading de- 
pendence on the length N of the molecule can be derived 
most easily from the mountain height representation of 
the secondary structure see Fig. |]. 

The height of the mountain above the linear sequence 
indicates the number of bonds that are crossed when one 
follows the helices and internal loop in the secondary struc- 
ture from the free part to the considered base. The aver- 
age height h is proportional to the typical distance be- 
tween peripheral bases within the secondary structure. If 
the latter is not too dense in real space one may assume 
that the helices and loops essentially follow a random walk 
in space, and the radius of gyration is expected to scale 
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Fig. 1. The standard representation (top) of the secondary 
structure and the corresponding mountain height representa- 
tion (bottom): Up- or downward steps correspond to bases that 
are paired down- or upstream in the sequence, respectively. The 
height of the mountain above a given base equals the number 
of bonds to be crossed when linking it to the free part of the 
chain (bases at the bottom of the standard representation) fol- 
lowing helices and internal loops, as indicated by the arrow. 
The average height h of the mountain is a measure of the typi- 
cal distance between peripheral bases. The extension n(S) of a 
secondary structure is taken to be the length of the free part. 
This equals the number of backbone elements and globules the 
free part contains. 



3.2 Distribution of base-pair lengths 

The mountain height is closely connected to the distribu- 
tion of base pair lengths (defined as I = j — i + 1 for a 
base pair (bi,bj)). In the homogeneous case, the proba- 
bility distribution P(l) ~ Z~ 3 / 2 (cut off on a scale O(N)) 
derives directly from the partition function of a molecule 
with N bases that scales as Z N ~ (~ N /N 3/2 H,||. The 
distribution of base pair lengths will be modified in the 
presence of disorder since there will be a tendency to pair 
bases over longer distances in order to take advantage of 
favourable pairing energies. This will result in lowering the 
exponent 3/2. 

The scaling of the average height can be derived from 
the distribution P(l) by noting that the total area under 
the mountain can be written as the sum over the length 
of each base pair, 

, N N , 

(2) 

3.3 Critical behaviour under an external force 



as R g ~ h . The thermal average of the typical moun- 
tain height scales like iV 1 / 2 since the ensemble of possible 
mountains is in one-to-one correspondence to a random 
walk in a half plane, constrained to return to the origin 
after N steps (possibly with a constant energy gain per 
up/down step) whose typical excursion is well-known to 
scale like 7V 1/2 . 

For the homogeneous model we thus deduce R g ~ 
N 1 / 4 which leads to a monomer density in three-dimensional 
space scaling like N/R 3 g ~ TV 1 / 4 . As pointed out in f33|,|o| 
this result can only be valid for small enough N since for 
large molecules excluded volume effects become essential 
and cannot be disregarded in the discussion of secondary 
structure. This makes the usual approaches to RNA fold- 
ing questionable in the case of large molecules since they 
are all based on the assumption that tertiary interactions 
can safely be neglected when the secondary structure is 
determined. However, we will see below that sequence dis- 
order reduces this problem in that it increases the typical 
base to base distance in the molecule just sufficiently to 
avoid too high monomer densities in real space. 

Finally, let us introduce the notion of the hierarchy 
level of a base within the secondary structure which we 
define as the number of multi-branched loops (loop junc- 
tion of at least three stems) crossed when following the 
secondary structure from the free part to the base. It in- 
dicates how deep in the tree-like structure of the pairing 
pattern the base is located. It will be a useful quantity 
to characterize the size and importance of force-induced 
rearrangements . 



Let us now consider the coupling of a homogeneous molecule 
to an external force. Up to a critical force f c the molecule 
is in a globular state and its extension remains very small. 
At high forces the molecule is extended, the number of 
bases in the free stretched part being proportional to N. 
The number of globules is extensive as well, and for forces 
near to the threshold it is essentially proportional to the 
extension since the mean length of the linker strands in 
between is only weakly force-dependent. As discussed in 
[^9[|2^] a second order phase transition takes place at the 
critical force, and both the extension and the height obey 
a scaling law of the form 

h = N 1 ^ h [N 1/2 (f^f c )} (3) 

and 

£ = N 1 / 2 ijj c [N 1/2 (f~fc)] (4) 

as is obtained from a careful analysis of the partition func- 
tion of a homopolymer under tension, cf., |2fJ].The scaling 
argument of the functions iph and tpc indicates the exis- 
tence of a correlation length that takes the form 

N c ~(f- fc)- 2 - (5) 

It is related to the size of the largest globules in the chain 
in the sense that a finite fraction of all bases belongs to 
globules of size N g i D b > N c . 

The critical force results from a competition between 
the energy gained from the external force upon increas- 
ing the free length of the chain and the corresponding 
free energy loss. While the latter is of almost purely en- 
tropic origin in the homogeneous case (since the extension 
can increase without reducing the number of paired bases 
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and thus without energy cost) there is also an enthalpic 
component to be included for disordered models. (In our 
effective zero temperature study it is even the unique con- 
tribution.) Using a Harris- type criterion [^o| . one can show 
that the disorder will only marginally affect the correla- 
tion length and the corresponding exponent will remain 
the same, i^ dls = t/ purc = 2. At high temperatures, the 
disordered models turn out to be in the same universal- 
ity class as homogeneous RNA, while at low temperatures 
disorder modifies the other critical exponents and renders 
the globular phase glassy. 



4 Disordered RNA 
4.1 Observables 

Since in a homogeneous chain a lot of energetically equiv- 
alent states exist that can be smoothly transformed into 
each other in phase space (by sliding the base pairing pat- 
tern) the secondary structure will respond to an applied 
force in an essentially continuous manner. However, the 
force-induced unfolding of disordered RNA proceeds in 
stepwise rearrangements that occur at well-defined forces 
when thermal smearing is neglected. We will describe these 
jump-like events statistically, averaging over all events in 
a narrow force window. The density of jumps, i.e., the fre- 
quency of occurrence per unit of force is small below the 
critical force since the polymer is trapped in metastable 
states, but it becomes extensive in the high force regime. 
The relevant characteristics of rearrangements are their 
sizes as given by the number of bases that change their 
pairing behaviour and the depth that is best captured by 
the largest hierarchy level involved in the rearrangement. 



4.2 Numerical methods 



The force-extension curves of random RNA were obtained 
numerically by determining all jump events, i.e., the set 
of forces at which the lowest free energy state changes, 
together with the corresponding secondary structures. To 
this end, we first solved recursively the folding problem 
restricted to the substrand from bases i to j with the 
standard 0(N 3 ) algorithm introduced by McCaskill pq , 
|3r| , that yields the ground state energies E (i,j) and the 
corresponding configurations (see Fig. g). In a second step, 
we determined the ground state Ei,(j;n) of the substrand 
from bases 1 to j constrained to contribute n elements to 
the free part of the chain (with n < j). This can be done 
using the recursion (see |18[ for a similar algorithm) 



Ehij] Ti > 2) = min | 



= mm -; mm 

k— n,...,j — 2 



E L (k-l;n-2) 

+ e ktj +E (k + l,j-l] 
E L (j-l;n-l)}, 



starting from the initial 

E L (j;0) 

E L (j >2;l) 



conditions 
= 0, 

= E (1,2), 

= e ld + E (2,j-l). 



(7) 
(8) 



1 l-.l h-H r 


j-i j 


i k-1 k k+1 


i-i 


(n-l) 


(n-2) ,' 







k-l k k+1 



Fig. 2. Schematic representation of the recursions over sub- 
strand lengths used to calculate the ground state energy 
Eo(i,j) of the substrand from base i to j (white boxes, top), 
and the ground state energy _E_l(j';ti) of the substrand from 
base 1 to j constrained to have extension n (black boxes, bot- 
tom). 



In order to keep the explanation of the algorithm sim- 
ple the restriction that isolated base pairs are not al- 
lowed has been dropped here and we only treated the case 
of pairing interactions (as in the Gaussian model). Both 
stacking energies and the exclusion of isolated base pairs 
can be taken into account by a minor modification of the 
algorithm. 

The lowest free energy for fixed extension n in the 
presence of force is given by El(N] n) — f ■ n. The exten- 
sion uq at vanishing force is the value of n that minimizes 
El(N;ti). The forces fa at which jumps of the ground 
state configuration occur, together with the respective ex- 
tensions rii, are iterativcly determined from 



f i+ i = min 

n i+1 >n, 



E L (N;n i+1 ) - E L (N;m) 
n i+ i - n, 



(9) 



(0) 



fii+i being the argument that minimizes the right hand 
side. (In the case of a degeneracy we chose the largest 
possible n i+1 .) 

Once all force intervals and the corresponding lowest 
free energy states were known, we averaged the single state 
observables (extension, height, maximal hierarchy level) 
over all states in a window of width Af = 0.01, while 
the characteristics of jump events (density, size, hierarchy 
level involved) were averaged over all events in a window of 
width Af — 0.05. Finally, a quenched average over about 
1000 samples of fixed lengths in the range N = 200- 2000 
was performed. 



5 Results 

5.1 General properties of globular and extended phase 

In Fig. U we plot the extension per base, n(f)/N, as a func- 
tion of the force for the Gaussian model. Above the critical 



6 



M. Muller et al.: The secondary structure of RNA under tension 



s 



0.1 





N=600 
N=1000 
N=1600 
N=2000 



0.6 



0.8 



f 



Fig. 3. Extension per base as a function of the force (Gaussian 
model, average over 1000 samples). The extension in the high 
force regime is clearly seen to be extensive. The inset shows 
the total extension of the chain in the low force regime. There 
the extension is very small and the chain typically does not 
contain more than 3 globules. 




Fig. 4. Frequency of occurrence of rearrangements per unit of 
force, divided by N. In the high force regime the rearrange- 
ments occur locally and effect mostly single globules. The as- 
sociated jump forces are uncorrelated and thus the frequency 
of events is proportional to the number of globules and thus 
to N. The inset shows the unsealed frequency of occurrence in 
the low force regime. Up to the critical force f c ~ 0.39 only 
few jumps occur on average, their number being essentially 
independent of the size of the molecule. 



force f c w 0.39 the chain contains an extensive number of 
globules, and its free length is proportional to the number 
of bases N. At low forces, the extension is very small and 
on average there are only very few globules in the chain, 
independently of N. This is also reflected in the frequency 
of rearrangements, see Fig. ^ Jumps are very rare below 
the critical force, on average only one or two usually rather 
important rearrangements take place up to / = 0.2. The 
situation changes dramatically above f c where the force 
pulls on an extensive number of essentially independent 
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Fig. 5. Force-extension plot for 10 random sequences of the 
four letters model for N = 400 (left) and N = 1800 (right). 
The magnitude of the sample to sample fluctuation decreases 
clearly when N increases, demonstrating that the system is 
self-averaging in the extensive phase. The continuous line in the 
left plot indicates the sequence- averaged extension, while in the 
right figure we plot the non-linear critical law C ~ (/— /c) 2 *- 1-7 ' 
to be expected in the thermodynamic limit. 



globules in parallel and a typical rearrangement only in- 
volves a single globule. In a model with continuously dis- 
tributed pairing energies, the globules rearrange or break 
up at well-defined forces that are practically uncorrelated 
among each other. Thus, the frequency of such events is 
expected to be proportional to the number of globules and 
thus scales like N. The release of bases, i.e., the increase of 
extension per jump event is finite and rather small on an 
experimental scale so that the force-extension character- 
istics becomes very smooth. This effect is also at the basis 
of self-averaging of the force-extension curves in the exten- 
sive phase as illustrated in Fig. ||. The sample-to-sample 
fluctuations in the extension per base at a given force de- 
crease with N since the force acts on a large number of 
globules in parallel and disorder effects are averaged out. 

The rearrangements are much more interesting in the 
low force regime and around the critical force. Far be- 
low the critical force a typical rearrangement consists in 
a large globule breaking up in several (up to 7) smaller 
substructures. Thereby, the rearrangement is not just su- 
perficial, that is, it does not only concern the uppermost 
levels of the hierarchical tree structure but typically in- 
volves a hierarchy level that grows like AT - 5 as we will see 
later. It reaches an average of 8 for N = 1600 at low forces 
in the Gaussian model which means that (at least) a cas- 
cade of 8 successive helices has to open up (and close dif- 
ferently) during the rearrangement. This involves a com- 
plicated pathway in the space of secondary structures, and 
the equilibration times in this regime of forces will be con- 
siderable. One therefore expects to see slow dynamics in 
real experiments that may reveal interesting information 
on the intermediate stages of the refolding. 

A rearrangement is not necessarily restricted to a sin- 
gle globule but can involve several neighbouring ones. This 
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sort of cooperativity is particularly large slightly below the 
critical force where on average 2 — 3 globules are involved. 
At higher forces the most frequent events are those where 
one small globule is opened up and stretched out. 



5.2 Scaling and universality 

Let us now examine the phase transition and the associ- 
ated critical behaviour in more detail. In analogy to the 
homogeneous case, Eqs. (||) and (0), we expect the moun- 
tain height and the extension to obey scaling laws 



h = N"MN 1/2 (f-fc)} 



(10) 



and 



C = N^c[N 1/2 {f - f c )]. (11) 

Note that we used the same correlation length exponent 
v = 2 as in the homogeneous case as suggested by a Harris- 
type criterion po| . The exponents (3 and 7 may however 
be modified by the disorder. Since forces sufficiently below 
f c are irrelevant for the structural properties such as the 
height h, it follows that the scaling function ^ph{x) tends 
to a constant for large negative values of x. Thus, we can 
extrapolate ( |l0| ) to / = and obtain the scaling behaviour 
in the ground state h(f = 0) ~ N*. The determination of 
j3 via the critical scaling allows for a much better control 
of finite size effects than a simple fit at / = (as was done 
in p0|) where small samples are effected by the vicinity of 
the critical point. 

In Eq. (0) we established a general relation between the 
average height h and the probability distribution of base 
pair lengths, P(l), both in the absence of force. In analogy 
to the homogeneous case, we expect a power law decay for 
the probability of long pairings according to P(l) ~ l~ a , 
from which we derive the scaling of the average height as 
h = 1/iVX^j P(l)l ~ N 2 - a , and hence a = 2-/3. Since 
the disorder favours distant pairings in order to gain en- 
ergy from particularly well matching substrands, we ex- 
pect a to be lowered with respect to the homogeneous case 
(where a = 3/2). 

In Fig. ^ we plot the sample averaged probability dis- 
tribution P(l). The best fit to a power law yields a w 
1.340 ± 0.003. Scaling plots for the height and the ex- 
tension, with optimized values for f c , 7 ps 0.71 ± 0.02 
and /3 w 0.67 ± 0.02 are shown in Figs. [7] and || for both 
the Gaussian (f c — 0.395 ± 0.005) and the four letters 
model (f c = 0.39 ±0.01). Note that the exponent relation 
(3 = 2 — a is very well confirmed by the numerical data. 
After a global rescaling of the axes with model dependent 
factors A and B the curves all collapse on a single scal- 
ing function. This strongly suggests that the two different 
disordered models belong to the same universality class, 
implying that the critical exponents are independent of 
the specific kind of disorder (but clearly distinct from the 
values in the homogeneous case). 

In the critical regime, there are numerous globules in 
the chain, each of which defines a (randomly distributed) 
force where it will rearrange. The frequency of rearrange- 
ments is thus proportional to the number of globules which 
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Fig. 6. Histogram of base pair lengths at / = 0, averaged over 
sequences of length TV = 1600 in the Gaussian model. The 
curve is the best fit to const./ - " which yields a w 1.34. 
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Fig. 7. Universal critical scaling of the extension for both the 
Gaussian and the four letters model. A and B are model- 
dependent constants. The critical exponent is found to be 
7 ~ 0.71. The inset shows a scaling plot of the frequency of 
rearrangements in the Gaussian model with the same critical 
exponent 7. 



in turn is proportional to the extension, at least in the crit- 
ical regime. Therefore the frequency of jumps scales with 
the same critical exponent 7 as the extension, as shown in 
the inset of Fig. [7| 



The scaling ([11]) suggests that in the thermodynamic 
limit the extension grows like £ ~ (/ — J c ) 2( ' 1_7 - ) iV just 
above the critical force, i.e., exhibiting a rather unex- 
pected non-linear response. Experimentally this effect can 
only be seen in very large molecules that are sufficiently 
self-averaging in order not to mask the phase transition 
by sequence specific signals, see Fig. ||. 
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Fig. 8. Universal critical scaling of the mountain height h for 
both disordered models. The constants A and B are the same 
as in Fig. 0. From the extrapolation to large negative values, 
the scaling h(f = 0) ~ N can be extracted with j3 ~ 0.67. 



5.3 Base pair lengths and the radius of gyration with 
disorder 

Following the reasoning in section lO we expect the radius 
of gyration to scale like R g ~ h 11 ~ iV 3 / 2 . This in turn 
gives rise to a monomer density in real space scaling like 
N/R? g ~ iV 1_3/3 / 2 « N°. In contrast to the homogeneous 

model where this density grows as iV 1 / 4 , sequence disorder 
leads to more elongated secondary structures that are not 
increasingly dense. In this sense, disorder re-establishes 
(at least marginally) the self-consistency of the standard 
approach to RNA-folding which neglects excluded volume 
effects (and other tertiary interactions) when optimizing 
the secondary structure. The latter would not be justi- 
fied if the individual side chains came into conflict with 
each other in real space, as would be inevitable for large 
molecules if f3 < 2/3. 

The inequality (3 > 2/3 is only a necessary global con- 
dition to avoid too dense structures. However, it is unclear 
whether it is sufficient to prevent structures with too high 
local densities. In any case, since the global condition is 
just marginally verified in random disordered RNA, we 
expect that excluded volume effects still play an impor- 
tant, if not decisive, role in large molecules and have to be 
taken into account in the secondary structure prediction 
of large molecules. 



5.4 Equality of (3 and 7 

The critical exponents (3 and 7 seem to be equal within 
the error bars. We will argue in favour of their equality by 
considering statistical properties of the secondary struc- 
ture at the critical force. 

At f c the number of globules in the chain is already 
large (~ A^ 7 ) and we assume that the globule sizes L are 
distributed according to a power law P(L) ~ L~ 5 with a 



cut-off on the scale N. This has been confirmed numeri- 
cally with an exponent 8 ~ 1.7. The total number of bases 
in the chain can be estimated as the sum over all globule 
sizes, 



N ^ N' 1 



O(N) 



P(L)LdL ~ N^N 



2-S 



(12) 



which implies 7 = 5—1. 

Let us now consider the largest globule in the chain. 
The scaling with N of the number L max of bases it con- 
tains follows easily from extreme value statistics, 



P{L)dL = 0(N-~<), 



(13) 



or Ni/L 5 ^ = 0(1). Thus, L max ~ N, i.e., the largest 
globule contains a finite fraction of all bases. We recall 
that according to Eq. (10) its height scales as N@. 

For the further discussion, we refer to Fig. [| The path 
through the secondary structure from the top most base in 
the mountain representation down to the free part splits 
the largest globule into two parts. If we imagine to apply 
the critical force separately to those to parts (see the right 
part of Fig. ||) we would expect their new folding to be 
qualitatively the same as in the intact globule. This is 
because the bases marked by the thick lines in Fig. [9] will 
gain an equivalent energy from exposure to the critical 
force as they gained from the pairing to the bases of the 
other strand. This simply reflects the equality of the free 
energy per base in a globular structure and in the free part 
at the critical force. The height of the original globule (~ 
N@) must therefore scale in the same way as the extension 
of one of the two halves subject to the critical force. For 
the latter Eq. ( |iT|) implies h ~ I^ ax ~ N 1 , and hence the 
equality f3 = 7. 

This picture is slightly too simplistic, for one expects 
that a certain fraction of the side structures of the glob- 
ule have a smaller (local) critical force and would open up 
when the substrands are exposed directly to the (global) 
critical force. However, the unzipping of such weakly bound 
side structures will by far not be complete. In a first step, 
the closing helix will be unzipped up to the first internal 
loop, and the subsequent structures attached to that loop 
will become directly exposed to the critical force. Again, 
a fraction of them will be unstable and rearrange. If one 
assumes this fraction to be constant (and not too large) 
on all hierarchy levels, the number of opening base pairs in 
a side structure is finite. We thus conclude that the total 
number of opening base pairs in the complete substrand 
is at most proportional to the number of side structures, 
and thus scales like the height h ~ N@ of the globule. The 
new extension £ of the substrands therefore still scales 
like ~ N@ , though with a larger prefactor. 

We expect finite size effects to be responsible for the 
slight difference in the numerical values found for j3 and 7. 
The partial opening of side structures in the above argu- 
ment, is naturally cut off at the maximal hierarchy level 
in the considered secondary structure. The latter is still 
rather small for the molecule sizes we studied numerically. 
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Fig. 9. A large structure is split into two halves along the 
path from the free part to the most distant base (thick line 
in the left). At the critical force, the cost in pairing energy 
for this splitting will essentially equal the energy gained from 
the external force upon stretching (right part). We therefore 
consider the folding within the full globule to be essentially 
equivalent to the folding of the same strand exposed to the 
critical force. 



The ratio C/h between the extension of the substrands 
and the height of the original globule becomes slightly 
larger when more hierarchy levels are available, but it will 
saturate at sufficiently large sizes. This might be at the 
basis of the numerical trend to find 7 > /3. 
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Fig. 10. Scaling plot of the hierarchy level involved in a rear- 
rangement, averaged over events in a force window Af = 0.01. 
The scaling at / = 0, 7V M , with jj, = 0.5, is smaller than that of 
the maximal hierarchy level in the structure (JV ' 65 ). However, 
it shows that a substantial fraction of the globule is involved 
in a rearrangement. 



5.5 Characteristics of force driven jumps 

Let us now analyze in more detail the single rearrange- 
ments and their statistics. The depth of a rearrangement 
is best characterized by the maximal hierarchy level in- 
volved. We note that the total number of hierarchy lev- 
els in a secondary structure scales in the same way as 
the height (~ N 13 ) since a single level always contains a 
bounded number of bases. 

To find the scaling of the average level involved in 
a jump we again resort to a critical point analysis, see 
Fig. fo, and extrapolate the scaling function to / = 0. 
The scaling is less neat than for other observables since 
finite size effects are rather important. Furthermore the 
statistics of jump events is sparse in the low force regime 
which causes the large error bars. Nevertheless an approx- 
imate increase of the average level as iV 5 can be estab- 
lished, showing that usually a considerable fraction of a 
globule takes part in rearrangements. The latter will be 
far from trivial, giving rise to long equilibration times and 
ageing effects in the low force regime, even in molecules of 
moderate size. 

Another important characteristic of a rearrangement is 
the fraction of bases that change their pairing behaviour. 
This includes the bases that were paired before the re- 
arrangement and are unpaired afterwards, or vice versa. 
This quantity measures a kind of phase space distance be- 
tween the secondary structure before and after the jump. 
We denote it by 1 — q where q is the overlap between the 
secondary structures which bears some resemblance to the 
overlap defined in spin glasses [^3| . 

Macroscopic rearrangements correspond to phase space 
distances 1 — q ~ 0(1). At low forces, the relative probabil- 
ity of occurrence for such events is found to decrease with 
the system size as (1 — q) ~ N~ a , where () denotes the av- 
erage over rearrangement events in a small force window 



and the subsequent average over disorder realizations. The 
exponent a w 0.25 ± 0.05 is relatively small, and thus, in 
RNA of moderate sizes macroscopic rearrangements are 
rather frequent. We illustrate this point in Fig. [ll] where 
we plot the overlap of the secondary structure at a given / 
with the groundstate (/ = 0) structure for five randomly 
chosen sequences in the four letters model. At low force, 
large changes on the scale of the system size are quite 
common, but of course, the behaviour differs a lot from 
sample to sample. The high force regime is far less inter- 
esting, being dominated by the rupture of small globules. 

In the critical regime the changes in the secondary 
structure are also quite dramatic since a lot of rather im- 
portant rearrangements occur in a small force window. 
Indeed, the phase space distance 1 — q(f; f + df) between 
the equilibrium secondary structures at the forces / and 
/ + df (for df fixed) increases approximately as N - 3 pro- 
vided both / and f + df belong to the critical regime (for 
fixed df, this restricts N to values such that Ndf 2 is still 
small). This suggests that hysteresis effects in cycling force 
experiments are particularly strong in the vicinity of the 
critical point. 

Force-extension experiments clearly probe the energy 
landscape of RNA. However, it is difficult to relate the 
statistics of the jump events, and in particular the over- 
lap between the secondary structures before and after the 
jump, to global properties of the energy landscape. If the 
latter is characterized by a unique size-dependent energy 
scale E(N) ~ N e , as proposed in ]2q , pl| , one might con- 
jecture that at low forces the probability distribution of 
large rearrangements in a chain of length N follows a scal- 
ing law P N (q) = E(N)P((l-q)E(N)). However, the data 
is not consistent with such a simple ansatz. This is prob- 
ably due to the fact that the force does not couple to 
the secondary structure via a bulk perturbation (as in the 
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Fig. 11. Overlap between the secondary structure at force / 
and the ground state (/ = 0) for five sequences with 1800 
bases in the four letters model. The amplitudes of jumps are 
very important in the low force phase, where the behaviour 
is strongly sequence-dependent. In the high force regime the 
overlap with the ground state is not particularly meaningful. 
Very stable structures that have resisted any rearrangement up 
to that point are finally broken up by the force. This happens 
in an almost continuous manner since the individual struc- 
tures are their respective rupture forces are almost randomly 
distributed. 

e-coupling method used in ^,|l|), but only via the exten- 
sion. This seems to slightly favour successions of smaller 
rearrangements instead of single large jumps. 

Let us recall at this point that we completely neglected 
thermal fluctuations, and thus all jump events are in- 
finitely sharp and occur at well defined forces. In real- 
ity such transitions are smoothed out over a small force 
window, and if the density of rearrangements is high, the 
individual peaks cannot be distinguished anymore. This is 
true in particular for the extensive phase, and lies at the 
basis of the observed smoothness of experimental force- 
extension curves. In that regime it proves to be better 
to identify transitions between competing foldings via in- 
creased fluctuations Jl7],[l8|]. However, at low forces, ther- 
mal effects are far less important since the rearrangements 
are rare, and we expect the present analysis to give a 
good qualitative picture of typical experimental curves. Of 
course, itdoes not reproduce correctly the detailed charac- 
teristics of force-extension curves, but it provide a rather 
quantitative insight into the nature of the energy land- 
scape as described by the size and frequency of occurrence 
of rearrangements, in particular in the low force regime. 



transition have been analysed in detail and the phases 
at low and high force have been characterized in terms 
of their different structural properties and response to the 
force. The critical behaviour of the opening transition was 
found to be modified with respect to the homogeneous 
case and strong evidence for the universality of different 
disorder models has been provided. The extrapolation of 
scaling laws to vanishing force has allowed to determine 
the scaling of several observables with TV in a clean man- 
ner. 

The statistics of rearrangement events at low forces 
have been characterized. While in the thermodynamic limit 
global macroscopic rearrangements of the secondary struc- 
ture are ruled out since their probability decreases as jV~°- 2! 
macroscopic jump events remain possible up to quite large 
system sizes. 

From the experimental point of view the low force 
regime is certainly the most interesting since the trans- 
formations in the secondary structure required to reach 
equilibrium are rather complex and will proceed in sev- 
eral steps, possibly passing through metastable interme- 
diates. It will be interesting to observe the approach to 
equilibrium after slightly changing the force to a new fixed 
value. Alternatively, cycling force experiments should ex- 
hibit interesting hysteresis effects as have been observed in 
the closely related DNA un- and rezipping ||. The crit- 
ical regime is particularly suited for this kind of exper- 
iments since the overlap between equilibrium secondary 
structures decreases very rapidly as the force is varied. 

Such experiments might also shed some light on the en- 
ergetic barriers B(N) between metastable states in glob- 
ules of size N. Those are related to typical equilibration 
times t(N) via r(N) ~ exp[B(N)/kT}. On the theoretical 
side, those barriers are not very well understood yet. In 
the spirit of the scaling picture |^,^lj, one would expect 
B(N) ~ E(N) ~ N e , with 9 « 0.15 - 0.35, whereas the 
analysis of barriers in the ground state ensemble of a de- 

§ierate toy model |37j as well as dynamic simulations [ j38[ 
of RNA-folding rather point towards barriers growing 
like B(N) ~ N 1 ^ 2 . Further work is needed to clarify this 
important issue. 
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